Information processing device and information processing method

ABSTRACT

An information processing device includes: a first reception unit configured to receive an input of one or more characters; a second reception unit configured to receive an input of voice; and a voice recognition unit configured to recognize the voice, and output a voice recognition result beginning with the one or more characters entered into the first reception unit when the second reception unit receives the input of voice with the input of the one or more characters received by the first reception unit.

CROSS-REFERENCE TO RELATED APPLICATION

This application claims priority to Japanese Patent Application No. 2018-015434 filed on Jan. 31, 2018, including the specification, incorporated herein by reference in its entirety.

BACKGROUND

1. Technical Field

The present disclosure relates to an information processing device and an information processing method.

2. Description of Related Art

There is known a voice recognition device that selects a hierarchy level of the menu of a navigation device according to the content of a user's input to a touchpad and recognizes voices using the voice recognition dictionary prepared for that hierarchy level (for example, refer to Japanese Patent Application Publication No. 2007-240688 (JP 2007-240688 A)).

SUMMARY

In some cases, after entering desired characters using a method other than voice recognition, the user may quit entering characters in the middle and then re-enter characters using voice recognition. In this case, there is room for improving the voice recognition accuracy.

The present disclosure provides an information processing device and an information processing method that improve the voice recognition accuracy.

A first aspect of the disclosure provides an information processing device. The information processing device includes: a first reception unit configured to receive an input of one or more characters; a second reception unit configured to receive an input of voice; and a voice recognition unit configured to recognize the voice, and output a voice recognition result beginning with the one or more characters entered into the first reception unit when the second reception unit receives the input of voice with the input of the one or more characters received by the first reception unit.

According to this aspect, when an input of voice is received with an input of characters received, the information processing device outputs voice recognition results beginning with the characters entered into the first reception unit, improving the voice recognition accuracy.

In the first aspect, the information processing device may include a storage unit configured to store a voice recognition dictionary including a plurality of words. The voice recognition unit may be configured to select a specific word beginning with the one or more characters entered into the first reception unit from the plurality of words included in the voice recognition dictionary, and output the specific word as the voice recognition result.

In the first aspect, the voice recognition unit may be configured to output the voice recognition result independently of the one or more characters entered into the first reception unit, when the voice recognition result beginning with the one or more characters entered into the first reception unit is not obtained.

In the first aspect, the voice recognition unit may be configured to, when the first reception unit receives an input of a plurality of characters and when the voice recognition result beginning with the plurality of characters is not obtained, output the voice recognition result beginning with a predetermined part of characters of the plurality of characters.

In the first aspect, the predetermined part of characters may be characters excluding a last character of the plurality of characters.

A second aspect of the disclosure provides an information processing method. The information processing method includes: receiving an input of one or more characters; receiving an input of voice; recognizing the voice; and when the input of voice is received with the one or more characters received, outputting a voice recognition result beginning with the one or more characters.

According to the present disclosure, the voice recognition accuracy is improved.

BRIEF DESCRIPTION OF THE DRAWINGS

Features, advantages, and technical and industrial significance of exemplary embodiments of the disclosure will be described below with reference to the accompanying drawings, in which like numerals denote like elements, and wherein:

FIG. 1 is a block diagram showing a configuration of an in-vehicle system according to a first embodiment; and

FIG. 2 is a flowchart showing the processing of the voice recognition device shown in FIG. 1.

DETAILED DESCRIPTION

First Embodiment

FIG. 1 is a block diagram showing a configuration of an in-vehicle system 1 according to a first embodiment. The in-vehicle system 1 is mounted on a vehicle. The in-vehicle system 1 includes an input device 10, an in-vehicle device 12, a microphone 14, a voice recognition switch 16, and a voice recognition device 18.

The input device 10 is a device for entering characters in response to a user operation. The input device 10 is, for example, a touchpad provided on the center console between the driver's seat and the passenger seat of a vehicle for allowing the user to perform a touch operation. The input device 10 outputs the operation signal, generated by a touch operation, to the in-vehicle device 12. With his or her wrist on the center console, the user performs the touch operation on the input device 10 while watching the display unit (not shown) on the dashboard but without looking at the input device 10. For example, the user slides his or her finger or taps the screen for entering characters or for selecting an in-vehicle function.

The input device 10 may be a touch panel provided on the display surface of the display unit for allowing the user to touch the panel or may be another input device on which the user can perform an operation to enter characters.

The in-vehicle device 12 causes the display unit to display an image related to the operation of the user based on the operation signal supplied from the input device 10. The in-vehicle device 12 is, for example, a car navigation device though not limited to it. For example, when setting a destination on the in-vehicle device 12 to obtain route guidance, the user slides his or her finger on the input device 10 to move the cursor displayed on the display unit. Then, when the user taps a position corresponding to one of a plurality of characters displayed on the display unit, the character is entered into the in-vehicle device 12. A plurality of characters are entered in this way to set the destination. Characters are entered in many situations, for example, when a personal name is entered to search for a telephone number. The in-vehicle device 12 outputs the entered character data to the voice recognition device 18.

In some cases, after manually entering some of the desired characters on the input device 10, the user may quit entering characters in the middle. For example, the user may quit entering characters because the user feels it difficult or troublesome to enter characters on the input device 10 or because the driving situation has changed. After quitting entering characters, the user re-enters characters through voice recognition. For example, when the characters to be entered are “Wanwan Park”, the user enters “Wa” or “Wan” via the input device 10 and, then, quits entering characters and speaks “Wanwan park”.

The microphone 14, provided in the vehicle interior, acquires the speech of the occupant of the vehicle and outputs the voice data to the voice recognition device 18.

The voice recognition switch 16, a switch operated by the user when the user desires voice recognition, is provided, for example, in the steering wheel. The voice recognition switch 16 is a pushbutton type or lever type mechanical switch. When the user presses this switch, the operation signal is output to the voice recognition device 18.

The voice recognition device 18 includes a first reception unit 30, a second reception unit 32, a third reception unit 34, a storage unit 36, and a voice recognition unit 38. The voice recognition device 18 functions as an information processing device.

The first reception unit 30 receives the input of characters from the in-vehicle device 12 and outputs the characters to the voice recognition unit 38.

The third reception unit 34 receives the operation signal from the voice recognition switch 16. Upon receipt of the operation signal, the third reception unit 34 outputs the voice recognition instruction to the second reception unit 32.

When the voice recognition instruction is output from the third reception unit 34, the second reception unit 32 receives the input of a voice from the microphone 14 for a predetermined period and outputs the voice to the voice recognition unit 38.

The storage unit 36 stores a voice recognition dictionary that includes a plurality of words. The plurality of words includes, for example, place names that may be set as the destination in the car navigation device and proper names such as person names registered in the telephone directory. The storage unit 36 may be provided in the in-vehicle device 12.

The voice recognition unit 38 recognizes a voice using a known technique when the second reception unit 32 receives an input of voice with no character received by the first reception unit 30, and outputs voice recognition results to the in-vehicle device 12. More specifically, from a plurality of words stored in the voice recognition dictionary, the voice recognition unit 38 selects words with a high degree of matching with the recognized character string and outputs the selected words to the in-vehicle device 12 as the voice recognition results.

A high degree of matching with the recognized character string means that the reliability is high. The reliability indicates the degree of possibility that a word is correctly recognized from voice data. The higher the reliability, the higher the possibility that the word is recognized correctly. The voice recognition unit 38 outputs one or more voice recognition results whose reliability is a predetermined value or higher. The predetermined value can be appropriately set by experiment.
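The selection by reliability described above can be sketched as follows. This is only an illustrative sketch: the disclosure does not specify how the reliability is computed, so a string-similarity ratio from Python's standard difflib module is used here as a stand-in, and the names reliability, recognize, DICTIONARY, and the threshold 0.8 are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical dictionary contents, taken from the examples in this description.
DICTIONARY = ["Wanwan park", "Daiichi park", "Wan hotel"]


def reliability(recognized, word):
    """Stand-in score in [0, 1]; the actual reliability measure is not specified."""
    return SequenceMatcher(None, recognized.lower(), word.lower()).ratio()


def recognize(recognized, dictionary, threshold=0.8):
    """Return the words whose reliability is the threshold or higher, best first."""
    scored = [(reliability(recognized, word), word) for word in dictionary]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [word for _, word in kept]


# The recognized character string is assumed to come from an acoustic front end.
results = recognize("Wanwan park", DICTIONARY)
```

With this stand-in score, only “Wanwan park” reaches the assumed threshold for the recognized character string “Wanwan park”; a real device would tune the predetermined value by experiment, as noted above.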

On the other hand, when the second reception unit 32 receives an input of voice with one or more characters entered into the first reception unit 30, the voice recognition unit 38 recognizes the voice and outputs voice recognition results, beginning with the characters entered into the first reception unit 30, to the in-vehicle device 12. More specifically, the voice recognition unit 38 selects words, beginning with the characters entered into the first reception unit 30 and having a high degree of matching with the recognized character string, from the plurality of words stored in the voice recognition dictionary and then outputs the words, selected in this manner, as the voice recognition results. The voice recognition unit 38 outputs one or more voice recognition results, whose reliability is a predetermined value or higher, to the in-vehicle device 12.

For example, assume that the characters the user wants to enter are “Wanwan park”, that the characters entered into the first reception unit 30 are “Wan”, that the recognized character string is “Wanwan park”, and that “Wanwan park” is included in the voice recognition dictionary. In this case, “Wanwan park” is output as the voice recognition result. Words of the form “ . . . park” that begin with characters other than “Wan”, such as “Daiichi park”, are not output as the voice recognition result even if they are included in the voice recognition dictionary. In addition, a word beginning with “Wan”, such as “Wan hotel”, is not output as the voice recognition result if its reliability is low, even if it is included in the voice recognition dictionary.

If voice recognition results beginning with the characters entered into the first reception unit 30 are not obtained, that is, if voice recognition results beginning with the entered characters and having a reliability level equal to or higher than the predetermined value are not obtained, the voice recognition unit 38 outputs voice recognition results independently of the characters entered into the first reception unit 30. That is, in this case, the voice recognition unit 38 selects words, having a high degree of matching with the recognized character string, from the plurality of words stored in the voice recognition dictionary, and outputs them as the voice recognition results. In such a case, the characters entered into the first reception unit 30 may be wrong.
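The prefix-constrained selection and its fallback can be illustrated with a short sketch. It rests on several assumptions that are not part of the disclosure: the reliability is approximated by a difflib similarity ratio, the prefix comparison ignores letter case, and the function names and the threshold 0.8 are hypothetical.

```python
from difflib import SequenceMatcher

DICTIONARY = ["Wanwan park", "Daiichi park", "Wan hotel"]  # hypothetical contents


def reliability(recognized, word):
    """Stand-in for the unspecified reliability measure."""
    return SequenceMatcher(None, recognized.lower(), word.lower()).ratio()


def matches_above(recognized, words, threshold):
    """Words whose reliability is the threshold or higher, best first."""
    kept = [w for w in words if reliability(recognized, w) >= threshold]
    return sorted(kept, key=lambda w: reliability(recognized, w), reverse=True)


def recognize_with_prefix(recognized, entered, dictionary, threshold=0.8):
    """Prefer words beginning with the entered characters; otherwise ignore them."""
    # Case-insensitive prefix matching is an assumption of this sketch.
    candidates = [w for w in dictionary if w.lower().startswith(entered.lower())]
    results = matches_above(recognized, candidates, threshold)
    if results:
        return results
    # No sufficiently reliable word begins with the entered characters, so the
    # result is produced independently of them (they may have been wrong).
    return matches_above(recognized, dictionary, threshold)
```

For instance, with the entered characters “Wan” and the recognized character string “Wanwan park”, the sketch returns only “Wanwan park”; if the recognized character string is “Daiichi park” instead, no word beginning with “Wan” is reliable enough and the fallback returns “Daiichi park”.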

The in-vehicle device 12 causes the display unit to display one or more voice recognition results, output from the voice recognition unit 38, to allow the user to select one of them. To select one of them, the user touches one of the voice recognition results on the input device 10. As a result, the selected character string, which is one of the voice recognition results, is set in the in-vehicle device 12.

This configuration can be implemented by the CPU, memory, and other LSIs of a computer on a hardware basis, and by a program loaded into the memory on a software basis. The above example shows the functional blocks that are implemented by cooperation between hardware and software. Therefore, it is understood by those skilled in the art that these functional blocks can be implemented in various forms by hardware only, by software only, or by a combination of hardware and software.

Next, the overall operation of the in-vehicle system 1 with the above configuration will be described. FIG. 2 is a flowchart showing the processing of the voice recognition device 18 shown in FIG. 1. The processing in FIG. 2 is repeated periodically.

If the voice recognition instruction is not received (N in S10), the processing is terminated. If the voice recognition instruction is received (Y in S10), the second reception unit 32 receives a voice input from the microphone 14 (S12). If the first reception unit 30 has not received characters (N in S14), the voice recognition unit 38 recognizes the voice (S16), outputs the voice recognition results (S18), and terminates the processing.

On the other hand, if the first reception unit 30 has received characters (Y in S14), the voice recognition unit 38 recognizes the voice while referring to the received characters (S20). If voice recognition results beginning with the received characters are obtained (Y in S22), the voice recognition unit 38 outputs the obtained voice recognition results (S24) and terminates the processing. If voice recognition results beginning with the received characters are not obtained (N in S22), the voice recognition unit 38 outputs voice recognition results independently of the received characters (S26) and terminates the processing.
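One pass through the flowchart of FIG. 2 might be traced as in the following sketch, which returns the label of the output step together with the results. The sketch assumes that the voice has already been converted to a recognized character string, uses a difflib similarity ratio in place of the unspecified reliability, and uses hypothetical names (process_cycle, select, score) and a hypothetical threshold of 0.8.

```python
from difflib import SequenceMatcher

DICTIONARY = ["Wanwan park", "Daiichi park", "Wan hotel"]  # hypothetical contents


def score(recognized, word):
    """Stand-in for the unspecified reliability measure."""
    return SequenceMatcher(None, recognized.lower(), word.lower()).ratio()


def select(recognized, words, threshold=0.8):
    """Words whose stand-in reliability is the threshold or higher."""
    return [w for w in words if score(recognized, w) >= threshold]


def process_cycle(instruction_received, recognized, entered, dictionary):
    """Return (output step label, results) for one periodic run of FIG. 2."""
    if not instruction_received:              # N in S10: terminate
        return None, []
    # S12: the voice input is represented here by `recognized`.
    if not entered:                           # N in S14
        return "S18", select(recognized, dictionary)   # S16, then S18
    # S20: recognize while referring to the received characters.
    candidates = [w for w in dictionary if w.startswith(entered)]
    results = select(recognized, candidates)
    if results:                               # Y in S22
        return "S24", results
    return "S26", select(recognized, dictionary)       # N in S22
```

Returning the step label is purely a device for following the branches in this sketch; the device described here outputs only the voice recognition results.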

According to this embodiment, if an input of voice is received with an input of characters received, the voice recognition device 18 outputs voice recognition results beginning with the received characters, thus improving the voice recognition accuracy by referring to the received characters.

In addition, the voice recognition device 18 selects words, beginning with the received characters, from the plurality of words stored in the voice recognition dictionary and outputs the selected words as voice recognition results, further improving the voice recognition accuracy.

In addition, if voice recognition results beginning with the entered characters are not obtained, the voice recognition device 18 outputs voice recognition results independently of the characters entered into the first reception unit 30. This makes it possible to take an appropriate action if characters are erroneously entered, providing better voice recognition results.

Second Embodiment

A second embodiment differs from the first embodiment in that, if voice recognition results beginning with a plurality of characters entered into the first reception unit 30 are not obtained, the voice recognition unit 38 outputs voice recognition results beginning with a predetermined part of characters of the plurality of characters. The second embodiment will be described below with emphasis on differences from the first embodiment.

The configuration of the in-vehicle system 1 in the second embodiment is not shown since the configuration is the same as that shown in FIG. 1. If the first reception unit 30 receives an input of a plurality of characters and if voice recognition results beginning with the plurality of characters are not obtained, the voice recognition unit 38 outputs voice recognition results beginning with a predetermined part of characters of the plurality of characters. The predetermined part of characters is, for example, the plurality of characters excluding the last one character. That is, the voice recognition unit 38 outputs voice recognition results beginning with one or more characters of the plurality of characters excluding the last one character.

For example, assume a situation in which, when the characters the user wants to enter are “Wanwan park”, the user has erroneously entered the third character with the result that the characters entered into the first reception unit 30 are “Waw” and, after that, the user uses voice recognition. In this situation, if the recognized character string is “Wanwan park”, if “Wanwan park” is included in the voice recognition dictionary, and if a word beginning with “Waw” is not included in the voice recognition dictionary, voice recognition results beginning with “Waw” are not obtained. In this case, “Wanwan park” beginning with “Wa”, which is generated by excluding the last one character “w” from the entered characters “Waw”, is output as the voice recognition result.

If voice recognition results beginning with a predetermined part of characters of the plurality of characters are not obtained, the voice recognition unit 38 outputs voice recognition results independently of the entered characters.

If the first reception unit 30 receives one character and if voice recognition results beginning with the received character are not obtained, the voice recognition unit 38 outputs voice recognition results independently of the received character.
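The fallback order of the second embodiment (all entered characters, then the entered characters excluding the last one, then no constraint) can be sketched as follows. As assumptions of this sketch only, the reliability is approximated by a difflib similarity ratio, and the names recognize_trimming_last, select, and score and the threshold 0.8 are hypothetical. The function also returns the prefix that was actually used, purely for illustration.

```python
from difflib import SequenceMatcher

DICTIONARY = ["Wanwan park", "Daiichi park", "Wan hotel"]  # hypothetical contents


def score(recognized, word):
    """Stand-in for the unspecified reliability measure."""
    return SequenceMatcher(None, recognized.lower(), word.lower()).ratio()


def select(recognized, words, threshold):
    """Words whose stand-in reliability is the threshold or higher."""
    return [w for w in words if score(recognized, w) >= threshold]


def recognize_trimming_last(recognized, entered, dictionary, threshold=0.8):
    """Return (prefix used, results), dropping the last entered character if needed."""
    prefixes = [entered]
    if len(entered) > 1:
        # Predetermined part: the entered characters excluding the last one.
        prefixes.append(entered[:-1])
    for prefix in prefixes:
        candidates = [w for w in dictionary if w.startswith(prefix)]
        results = select(recognized, candidates, threshold)
        if results:
            return prefix, results
    # Neither prefix yields a result (or only one character was entered):
    # output results independently of the entered characters.
    return "", select(recognized, dictionary, threshold)
```

With the entered characters “Waw” and the recognized character string “Wanwan park”, no dictionary word begins with “Waw”, so the sketch retries with “Wa” and returns “Wanwan park”.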

According to this embodiment, if voice recognition results beginning with a plurality of received characters are not obtained, voice recognition results beginning with a predetermined part of characters of the plurality of characters are output. In this way, this embodiment excludes a character that is likely to be erroneously entered, thus increasing the possibility that correct voice recognition results are output.

In some cases, when characters are erroneously entered into the input device 10, the user may quit entering characters. In this case, the last one character of the plurality of characters is highly likely to be erroneous. In this embodiment, the voice recognition unit 38 outputs voice recognition results beginning with the characters excluding the last one character of the plurality of characters. In this way, this embodiment excludes the last one character that is likely to have been erroneously entered, making it possible to output a more accurate voice recognition result.

The present disclosure has been described with reference to the embodiments. Note that the embodiments are merely an example. It is to be understood by those skilled in the art that various modifications are possible by combining the components and the processing processes and that such modifications are also within the scope of the present disclosure.

For example, in the first embodiment, if voice recognition results beginning with the characters entered into the first reception unit 30 are not obtained, the voice recognition unit 38 may notify the in-vehicle device 12 that voice recognition results are not obtained and then enter the wait state. In the second embodiment, if voice recognition results beginning with one character entered into the first reception unit 30 are not obtained or if voice recognition results beginning with a predetermined part of the plurality of characters entered into the first reception unit 30 are not obtained, the voice recognition unit 38 may notify the in-vehicle device 12 that the voice recognition results are not obtained and then enter the wait state. Upon receiving this notification, the in-vehicle device 12 notifies the user, via voice or image, that the voice could not be recognized. In these cases, the user is required to operate the voice recognition switch 16 and then speak again. When the second reception unit 32 receives an input of a new voice in response to the new voice recognition instruction, the voice recognition unit 38 recognizes the voice and outputs the voice recognition results independently of the characters entered into the first reception unit 30. This modification makes it possible to make the configuration of the voice recognition device 18 more flexible.

What is claimed is:
 1. An information processing device comprising: a first reception unit configured to receive an input of one or more characters; a second reception unit configured to receive an input of voice; a storage unit configured to store a voice recognition dictionary including a plurality of words; and a voice recognition unit configured to: recognize the voice, receive the one or more characters from the first reception unit, select words beginning with the one or more characters entered into the first reception unit from the plurality of words included in the voice recognition dictionary, determine a specific word from the words beginning with the one or more characters entered into the first reception unit that matches the recognized voice, determine whether the specific word matches the recognized voice with a degree of reliability that is greater than a predetermined value, output the specific word as a voice recognition result when the degree of reliability is greater than the predetermined value, and output a voice recognition result independently of the one or more characters entered from the first reception unit when the degree of reliability is not greater than the predetermined value.
 2. The information processing device according to claim 1, wherein the voice recognition unit is configured to output, when the first reception unit receives an input of a plurality of characters and when the voice recognition result beginning with the plurality of characters is not obtained, the voice recognition result beginning with a predetermined part of characters of the plurality of characters.
 3. The information processing device according to claim 2, wherein the predetermined part of characters are characters excluding a last character of the plurality of characters.
 4. An information processing method comprising receiving an input of one or more characters; receiving an input of voice; recognizing the voice; selecting words beginning with the one or more characters from a plurality of words included in a voice recognition dictionary, determining a specific word from the words beginning with the one or more characters that matches the recognized voice, determining whether the specific word matches the recognized voice with a degree of reliability that is greater than a predetermined value, outputting, the specific word as a voice recognition result when the degree of reliability is greater than the predetermined value, and outputting a voice recognition result independently of the one or more characters entered from a first reception unit when the degree of reliability is not greater than the predetermined value.